The purpose of this assignment is to assess the equity implications of air quality in San Mateo County (SMC), using PurpleAir sensor data as the basis for analysis. Geographic, population, and data equity will each be reviewed, and the results will be displayed through interactive dashboards.
The first part of this study will review geographic equity for PurpleAir sensors in SMC over 4 weeks in February 2022. First, the sensor location data was obtained from the PurpleAir website. After filtering these down to SMC, PM2.5 and AQI were calculated for each sensor. The distribution of sensors and their AQI classification based on their PM2.5 can be seen below.
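For reference, the PM2.5-to-AQI step mentioned above can be sketched as a piecewise-linear interpolation over a breakpoint table. The write-up does not state which table was used, so the breakpoints below are an assumption (the US EPA 24-hour PM2.5 table that was in effect in 2022), and the function name `pm25_to_aqi` is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

# ASSUMPTION: US EPA 24-hour PM2.5 breakpoints as in effect in 2022.
# Each row: (conc_low, conc_high, aqi_low, aqi_high, category), conc in ug/m3.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50, "Good"),
    (12.1, 35.4, 51, 100, "Moderate"),
    (35.5, 55.4, 101, 150, "Unhealthy for Sensitive Groups"),
    (55.5, 150.4, 151, 200, "Unhealthy"),
    (150.5, 250.4, 201, 300, "Very Unhealthy"),
    (250.5, 350.4, 301, 400, "Hazardous"),
    (350.5, 500.4, 401, 500, "Hazardous"),
]


def pm25_to_aqi(pm25):
    """Return (AQI, category) for a PM2.5 concentration in ug/m3."""
    if pm25 < 0:
        raise ValueError("PM2.5 concentration cannot be negative")
    # Truncate to one decimal place so values never fall between table rows.
    conc = float(Decimal(str(pm25)).quantize(Decimal("0.1"), rounding=ROUND_DOWN))
    for c_lo, c_hi, i_lo, i_hi, category in PM25_BREAKPOINTS:
        if c_lo <= conc <= c_hi:
            # Linear interpolation within the matching breakpoint row.
            aqi = (i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo
            return round(aqi), category
    # Above the top of the table: cap at the maximum index value.
    return 500, "Hazardous"


if __name__ == "__main__":
    print(pm25_to_aqi(8.3))
```

Returning the category alongside the index keeps the classification used for the map in the same place as the calculation.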
From here, the ThingSpeak data was collected for February 2022. Because processing the entire county is computationally intensive, the analysis was limited to Redwood City, Menlo Park, Burlingame, Millbrae, San Bruno, San Carlos, and San Mateo, which together make up the bulk of the county. Given the sensor locations, it is important to know the boundaries within which air quality data from each sensor should be used. This was accomplished using the Voronoi method, which divides the area into zones such that every location in a zone is closer to that zone's sensor than to any other sensor. One key assumption is that only outdoor sensors are applicable, as indoor sensors may reflect more localised air quality. In addition, the effective sensor area assumes that air can travel freely, which is more plausible outdoors. The Voronoi boundaries can be seen below.
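The defining property of a Voronoi zone, that each location belongs to its nearest sensor, can be illustrated with a short sketch. This is not the polygon construction used in the study. It only assigns a single point to the closest sensor by great-circle distance, and the sensor IDs and coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_sensor(lat, lon, sensors):
    """Return the ID of the sensor whose Voronoi zone contains (lat, lon)."""
    return min(sensors, key=lambda sid: haversine_km(lat, lon, *sensors[sid]))


# Hypothetical outdoor sensors: ID -> (latitude, longitude).
sensors = {
    "sensor_a": (37.4852, -122.2364),
    "sensor_b": (37.4530, -122.1817),
    "sensor_c": (37.5630, -122.3255),
}

if __name__ == "__main__":
    print(nearest_sensor(37.46, -122.19, sensors))
```

In practice the zones would be built as polygons in a projected coordinate system so that they can be intersected with census geometries. The nearest-sensor rule above is what those polygons encode.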
Next, the census block group (CBG) air quality data was calculated. This was done by finding the spatial intersection of the Voronoi boundaries with the CBGs and then using a weighted mean to calculate the PM2.5 for each CBG. To create a more interactive interface, a Shiny dashboard was used to display this data. It allows the user to select between different jurisdictions, view the daily air quality data, and see the CBG distribution on a map. See the dashboard through the following link:
The next part of this study will focus on a population equity analysis. In particular, it will compare income and race data against PM2.5 levels at the CBG and block level to identify any under- or over-represented populations. The income analysis will utilise ACS income data at the CBG level. For simplicity, only the past week of PurpleAir data will be used. An important assumption is that, although the indoor sensors used in this analysis are each specific to one household, their readings are taken to apply to every household in that CBG, and the race distribution is taken to be consistent across it. The race analysis, on the other hand, will use Decennial Census data at the block level to produce an equity analysis chart against PM2.5 levels.
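One way to read such an equity chart is as the share of each group's population that falls in each PM2.5 band. The sketch below computes those shares. The bin edges, group names, and block records are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def pm25_bin(pm25, edges=(4.0, 8.0, 12.0)):
    """Label a PM2.5 value with the band it falls in (edges are assumed)."""
    lower = 0.0
    for upper in edges:
        if pm25 < upper:
            return f"{lower:g}-{upper:g}"
        lower = upper
    return f"{lower:g}+"


def exposure_shares(blocks):
    """For each group, the fraction of its population in each PM2.5 band."""
    counts = defaultdict(lambda: defaultdict(float))
    totals = defaultdict(float)
    for block in blocks:
        band = pm25_bin(block["pm25"])
        for group, population in block["population"].items():
            counts[group][band] += population
            totals[group] += population
    return {
        group: {band: n / totals[group] for band, n in bands.items()}
        for group, bands in counts.items()
        if totals[group] > 0
    }


# Hypothetical block records: PM2.5 level and population by group.
blocks = [
    {"pm25": 3.2, "population": {"group_x": 100, "group_y": 300}},
    {"pm25": 9.5, "population": {"group_x": 300, "group_y": 100}},
]
shares = exposure_shares(blocks)

if __name__ == "__main__":
    for group, bands in shares.items():
        print(group, bands)
```

A group whose share in the higher bands exceeds its share of the overall population would be over-represented at those PM2.5 levels. The same calculation applies to income brackets at the CBG level.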
First, the Voronoi boundaries had to be determined at the CBG and block level so that the air quality data could be aggregated to those levels. Once again, spatial intersection was used to determine the PM2.5 levels for each geographic boundary. The respective plots for the income and race equity analyses can be seen below. The data can also be explored through the interactive dashboard at the following link:
https://agkerr.shinyapps.io/AlessandroKerr_A5_dashboard2/
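The spatial-intersection step used in both the geographic and population analyses comes down to a weighted mean over the pieces of Voronoi zones that overlap a census unit. The sketch below assumes the weights are the overlap areas, which the write-up does not state explicitly. The function name and input layout are hypothetical.

```python
def area_weighted_pm25(pieces):
    """Area-weighted mean PM2.5 for one census unit.

    pieces: list of (overlap_area, pm25) pairs, one per Voronoi zone that
    intersects the unit. Areas may be in any consistent unit.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area <= 0:
        raise ValueError("census unit has no overlapping sensor area")
    return sum(area * pm25 for area, pm25 in pieces) / total_area


if __name__ == "__main__":
    # Hypothetical CBG covered by two sensor zones.
    print(area_weighted_pm25([(3.0, 8.0), (1.0, 12.0)]))
```

Because the weights are normalised by the total overlapping area, a unit that is only partly covered by sensor zones still receives a value based on the covered portion.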
The last part of this analysis will consider data equity, and whether certain areas are underrepresented in terms of sensor locations. Ultimately, a weighted score will be given to each CBG based on racial distribution, population density, and sensor coverage area. It was assumed that each sensor can provide air quality data within a 400m (1/4 mile) diameter, and that only outdoor sensors are applicable (so that air can travel freely within the 400m circle). The census boundaries, sensor locations, and coverage area can be seen in the maps below. From here, the percent area coverage, population density, and racial distribution were calculated for each CBG. The score for each CBG can then be determined with user input on a dashboard. Based on the lowest score, the dashboard recommends a location for a new sensor. See the dashboard through the following link: